
Matt Cutts’ Videos — Take 2!

Boy, what did I get myself into? Rand has chained me to my desk and ordered me to summarize the next set of videos released by Matt Cutts. This new batch of summaries is more verbatim than the last one, so enjoy.

1. Does Google treat dynamic pages differently than static pages?

Google ranks static and dynamic pages in similar ways. PageRank flows to dynamic URLs just as it flows to static URLs. Matt provides an example: if you have the NY Times linking to a dynamic URL, you’ll still get the PageRank benefit, and the page will still pass that PageRank along.

There are other search engines that in the past have said, “Okay, we’ll go one level deep from static URLs, so we’re not gonna crawl from a dynamic URL, but we’re willing to go into the dynamic URL space from a static URL.” So, the short answer is that PageRank still flows the same between static and dynamic.

Matt provides a more detailed answer as well. The example the question asker gave had five parameters, and one of them was a product ID of “2725.” Matt maintains that you definitely can use too many parameters; he would opt for two or three at the most if you have any choice whatsoever. Also, try to avoid long numbers, because Google can mistake those for session IDs. It’s a good idea to get rid of any extra parameters.

Remember that Google is not the only search engine out there, so if you have the ability to basically say, “I’m gonna use a little bit of mod_rewrite, and I’m gonna make it look like a static URL,” that can often be a very good way to tackle the problem. PageRank still flows, but experiment. If you don’t see any URLs that have the same structure or the same number of parameters as you’re thinking about using, it’s probably better to either cut back on the number of parameters, shorten them somehow, or try to use mod_rewrite.
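As a rough illustration of the mod_rewrite approach Matt mentions, a rule like the one below maps a static-looking URL onto a dynamic script; the product.php script and the /products/ path are hypothetical, not anything from the video:

    # .htaccess sketch (hypothetical paths): serve /products/2725 via product.php?id=2725
    RewriteEngine On
    RewriteRule ^products/([0-9]+)/?$ product.php?id=$1 [L,QSA]

Visitors and search engines both see the clean /products/2725 URL, while the server quietly passes the single id parameter to the script.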

2. I have a friend whose site was hacked, and he didn’t know about it for a couple of months because Google had taken it out or something like that. Can Google inform the webmaster of this occurrence? Basically, when your site gets hacked, can Google inform someone within Sitemaps that maybe inappropriate pages were crawled?

Matt guesses that Google doesn’t have the resources to do something like that right now. In general, when somebody’s hacked, if he has a small number of sites he’s monitoring he’ll usually notice it pretty quickly, or else the web host will alert him to it. The Sitemaps team is always willing to work on new things, but Matt guesses that this would be on the low end of the priority list.

3. I’d like to use geo targeting software to deliver different marketing messages to different people in different parts of the world (e.g., a discounted pricing structure). Are we safe to run with this sort of plain vanilla use of geo targeting software? Clearly, we want to avoid any suspicion of cloaking.

The way that Google defines cloaking is very specific. Cloaking is defined as “showing different content to users than you show to search engines.” Geo targeting by itself is not cloaking under Google’s guidelines, because you’re saying, “Take the IP address. Oh, you’re from Canada (or Germany, or whatever). We’ll show you this particular page.”

The thing that will get you in trouble is if you treat Googlebot in some special way. If you’re geo targeting by country, don’t make a special country just for Googlebot (Matt says “Googlebotstan” as an example). Treat Googlebot just like a regular user: if you geo target by country and Googlebot is coming from an IP address that’s in the United States, just give it whatever United States users would see. Google itself does geo targeting, for example, and it’s not considered to be cloaking. Just treat Googlebot like you would any other user with that IP address, and you should be fine.
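To make “treat Googlebot like a regular user” concrete, here is a minimal Python sketch, assuming a hypothetical GeoIP lookup; the point is simply that the page is chosen from the visitor’s IP-derived country and the user agent is never consulted:

    # Hypothetical geo-targeting sketch: the page depends only on the IP-derived
    # country, so Googlebot gets whatever any visitor from the same country gets.

    def lookup_country(ip_address: str) -> str:
        """Stand-in for a real GeoIP lookup (e.g. a commercial IP-to-country database)."""
        fake_geoip = {"24.114.": "CA", "84.19.": "DE"}  # made-up prefixes for the sketch
        for prefix, country in fake_geoip.items():
            if ip_address.startswith(prefix):
                return country
        return "US"

    def page_for_visitor(ip_address: str, user_agent: str) -> str:
        # Deliberately no "if 'Googlebot' in user_agent" branch -- carving out a
        # special country just for the bot is exactly what could look like cloaking.
        pages = {"CA": "pricing_canada.html", "DE": "pricing_germany.html"}
        return pages.get(lookup_country(ip_address), "pricing_us.html")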

Because people joked that in Matt’s videos it looks like he’s been kidnapped (he had been previously answering questions in front of a blank wall), for the next series of videos he hung the closest thing to a map of the world up behind him. In this case, the closest thing to a map was a poster of “Language Families of the World.” Matt reads the map and then says, “Did you know that there are over 5,000 languages spoken across the earth? How many does Google support? Only about a hundred. Still a ways to go.”

4. One of my clients is going to acquire a domain name very related to his business, and it has a lot of links going to it. He basically wants to do a 301 redirect to the final website after the acquisition. Will Google ban or apply a penalty for doing this 301 redirect?

In general, probably not. You should be okay, because you specify that it’s very closely related. Any time there’s an actual merger of two businesses or two domains that are very close to each other, doing a 301 should be no problem whatsoever. If, however, you are a music site and all of a sudden you are acquiring links from a debt consolidation site, that could raise a few eyebrows. But it sounds like this is just a run-of-the-mill sort of thing, so you should be okay.
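For what it’s worth, a site-wide 301 from an acquired domain can be as simple as the following Apache directive placed on the old domain (the domain names are made up for illustration):

    # .htaccess on the acquired domain -- send every path to the main site with a 301
    Redirect 301 / https://www.main-example.com/

Because Redirect matches by prefix, a request for /old-page.html on the acquired domain ends up at https://www.main-example.com/old-page.html.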

5. What’s the best way to theme a site using directories? Do you put your main keyword in a directory or on the index page? If using directories, do you use a directory for each set of keywords?

Matt thinks that the question asker is thinking too much about keywords and not enough about site architecture. He prefers a treelike architecture so everything branches out in nice, even paths.

It’s also good if things are broken down by topic. If you’re selling clothes, you might have sweaters as one directory and shoes as another directory. If you do that sort of thing, your keywords do end up in directories.

As far as directories vs. the actual name of the HTML file, it doesn’t really matter that much within Google’s scoring algorithm. If you break things down by topic and make sure those topics match well with the keywords you expect your users to type in when they try to find your page, then you should be in pretty good shape.
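A hypothetical version of the clothing-store layout Matt describes might look like this, with each topic in its own directory so the keywords end up in the URLs naturally:

    www.example-store.com/sweaters/
    www.example-store.com/sweaters/wool-cardigan.html
    www.example-store.com/shoes/
    www.example-store.com/shoes/running-shoes.html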

6. If an e-commerce site’s URL has too many parameters and it is un-indexable, is it acceptable under Google’s guidelines to serve static HTML pages to the bot to index instead?

This is something to be very careful about, because otherwise you could end up veering into cloaking. Again, cloaking is showing different content to users than to Googlebot. You want to show the exact same content to users as you do to Googlebot.

Matt’s advice would be to go back to that question he previously answered about dynamic parameters in URLs. See if there’s a way to unify it so the users and Google both see the same directory. If you can do something like that, that’s going to be much better.

If not, you want to make sure that whatever HTML pages you do show, users who go to the same page don’t get redirected; they need to see the exact same page that Googlebot saw. That’s the main criterion of cloaking, and that’s where you’ll have to be careful.

7. I would like to use A/B split testing on my static HTML site. Will Google understand my PHP redirect for what it is, or will they penalize my site for perceived cloaking? If this is a problem, is there a better way to split test?

Matt suggests split testing in an area where search engines aren’t going to index it. Any time Google goes to a page and sees different content, or if they reload the page and see different content, that will look a little strange.

If you can, it’s better to use robots.txt or htaccess files or something to make sure that Google doesn’t index your A/B testing. If not, Matt recommends not using a PHP redirect. He recommends using something server side to actually serve up the two pages in place.
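As a small example of the robots.txt option, if the test variants lived under a hypothetical /ab-test/ directory, a rule like this would keep crawlers away from them:

    # robots.txt -- block crawling of a hypothetical A/B testing directory
    User-agent: *
    Disallow: /ab-test/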

The one thing to be careful about is not doing anything special for Googlebot. Just treat it like a regular user. That’s gonna be the safest thing in terms of not being treated like cloaking.

8. Aw heck, how about a real question? Ginger or Mary Anne?

“I’m gonna go Mary Anne.”

9. Should I be worried about this? site:tableandhome.com returns 10,000 results; site:tableandhome.com -intitle:buy returns 100,000 results, all supplemental.

In general, no, don’t worry about this. Matt then explains the concept of the beaten path. If there’s a problem with a one word search at Google, that’s a big deal. If it’s a 20 word search, that’s obviously less of a big deal because it’s off the beaten path. The supplemental results team takes reports very seriously and acts very quickly on them, but in general something in supplemental results is a little further off the beaten path than the main web results.

Once you start getting into negation, or negation with a special operator like intitle:, that’s pretty far off the beaten path. And you’re talking about results estimates, so not actual web results but the estimate of the number of results.

The good news is there are a couple of things that will make the site: estimates more accurate. There are at least two changes that Matt knows of in Google’s infrastructure: one is deliberately trying to make site: results more accurate, and the other is a change in the infrastructure to improve overall quality which, as a side benefit, will count the number of results from a site more accurately when it involves the supplemental results.

There are at least a couple changes that might make things more accurate, but in general once you start to get really far off the beaten path (-intitle:, etc.), especially with supplemental results, don’t worry that much about the results estimates. Historically, Google hasn’t worried that much because not that many people have been interested. But they do hear more people expressing curiosity about the subject, so they’re putting a little more effort into it.

10. I have a question about redirects. I have one or more pages that have moved on various websites. I use classic ASP and [have been given a response of a 301]. These redirects have been set up for quite a while, and when I run a spider on them it handles the redirect fine.

This is probably an instance where you’re seeing this happen in the supplemental results. Matt posits that there’s a main web results Googlebot and there’s a supplemental results Googlebot. The next time a supplemental results Googlebot visits that page and sees the 301, it will index it accordingly and refresh and things will go fine.

Historically, the supplemental results have included a lot of extra data but have not been refreshed as fast as the main web results. Anybody can verify this by looking at cached pages and comparing the crawl dates, which vary. The good news is that the supplemental results are getting fresher and fresher, and there’s an effort underway to make them quite fresh.

11. I’d like to know more about the supplemental index. It seems that while you were on vacation many sites got put there. I have one site where this happened: it has a PageRank of 6, and it has been in the supplemental results since late May.

There is a new infrastructure in the supplemental results. Matt mentioned that in a blog post, and while he doesn’t know how many people have noticed it, he’s certainly said it before. (“I think it was in the indexing timeline, in fact.”)

As Google refreshes the supplemental results and starts to use new indexing infrastructure there, the net effect is that things will be a little fresher. Matt is sure that he has some URLs in the supplemental results himself, so he wouldn’t worry about it that much.

Over the course of the summer the supplemental results team will take all the different reports that they see, especially things off the beaten path, like site: and operators that are kind of esoteric, and they’ll be working on making sure that those return the sort of results that everybody naturally expects.
